The abundance of data has given machine learning considerable momentum in natural sciences and engineering, though modeling of physical processes is often difficult. A particularly tough problem is the efficient representation of geometric boundaries. Triangularized geometric boundaries are well understood and ubiquitous in engineering applications. However, it is notoriously difficult to integrate them into machine learning approaches due to their heterogeneity with respect to size and orientation. In this work, we introduce an effective theory to model particle-boundary interactions, which leads to our new Boundary Graph Neural Networks (BGNNs) that dynamically modify graph structures to obey boundary conditions. The new BGNNs are tested on complex 3D granular flow processes of hoppers, rotating drums and mixers, which are all standard components of modern industrial machinery but still have complicated geometry. BGNNs are evaluated in terms of computational efficiency as well as prediction accuracy of particle flows and mixing entropies. BGNNs are able to accurately reproduce 3D granular flows within simulation uncertainties over hundreds of thousands of simulation timesteps. Most notably, in our experiments, particles stay within the geometric objects without using handcrafted conditions or restrictions.
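As a rough illustration of the dynamic graph modification described above, the sketch below inserts transient particle-boundary edges only where a particle comes within a cutoff radius of a boundary triangle. The function names and the centroid-based distance shortcut are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the paper's implementation): transient
# particle-boundary edges are recomputed at every timestep, so boundary
# triangles enter message passing only near particles. Using triangle
# centroids instead of exact point-triangle distances is a simplification.
import numpy as np

def boundary_edges(particles: np.ndarray, tri_centroids: np.ndarray,
                   cutoff: float) -> np.ndarray:
    """Return (particle_idx, triangle_idx) pairs within `cutoff` of each other."""
    # Pairwise distances between N particles and M triangle centroids.
    dists = np.linalg.norm(
        particles[:, None, :] - tri_centroids[None, :, :], axis=-1)
    return np.argwhere(dists < cutoff)  # shape (E, 2): dynamic boundary edges
```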
Mixup is a popular data augmentation technique for training deep neural networks, in which additional samples are generated by linearly interpolating pairs of inputs and their labels. This technique is known to improve generalization performance in many learning paradigms and applications. In this work, we first analyze Mixup and show that it implicitly regularizes infinitely many directional derivatives of all orders. We then propose a new method to improve Mixup based on this novel insight. To demonstrate the effectiveness of the proposed method, we conduct experiments across various domains such as images, tabular data, speech, and graphs. Our results show that the proposed method improves Mixup across various datasets using a variety of architectures, for instance exhibiting an improvement over Mixup of 0.8% in ImageNet top-1 accuracy.
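For reference, vanilla Mixup as described above can be written in a few lines; this is a minimal numpy sketch of the standard formulation (the paper's proposed improvement is not shown here).

```python
import numpy as np

def mixup(x: np.ndarray, y: np.ndarray, alpha: float = 0.2):
    """Vanilla Mixup: convexly combine a batch with a shuffled copy of itself."""
    lam = np.random.beta(alpha, alpha)       # mixing coefficient in (0, 1)
    perm = np.random.permutation(len(x))     # random pairing of samples
    x_mix = lam * x + (1.0 - lam) * x[perm]  # interpolate inputs
    y_mix = lam * y + (1.0 - lam) * y[perm]  # interpolate one-hot labels
    return x_mix, y_mix
```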
Biometric data are among the most privacy-sensitive kinds of data. Ubiquitous authentication systems with a focus on privacy favor decentralized approaches, as these reduce potential attack vectors on both a technical and an organizational level. The gold standard is to let users control where their own data is stored, which in turn leads to a high variety of devices being used. Moreover, in comparison with a centralized system, designs with greater end-user freedom often incur additional network overhead. Therefore, when using face recognition for biometric authentication, an efficient way to compare faces is important in practical deployments, because it reduces both the network and hardware requirements that are essential to encouraging device diversity. This paper proposes an efficient way to aggregate embeddings used for face recognition, based on an extensive analysis of different datasets and the use of different aggregation strategies. As part of this analysis, a new dataset has been collected, which is available for research purposes. Our proposed method supports the construction of massively scalable, decentralized face recognition systems with a focus on both privacy and long-term usability.
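The snippet below is a minimal sketch of one common aggregation strategy of the kind such an analysis would compare: mean pooling followed by re-normalization. The function names and the similarity threshold are illustrative assumptions, not necessarily the strategy the paper recommends.

```python
import numpy as np

def aggregate_embeddings(embs: np.ndarray) -> np.ndarray:
    """Collapse K unit-norm face embeddings of one identity into one template."""
    template = embs.mean(axis=0)                 # mean pooling
    return template / np.linalg.norm(template)   # re-normalize to unit length

def match(template: np.ndarray, probe: np.ndarray,
          threshold: float = 0.6) -> bool:
    """Compare a unit-norm probe against the single aggregated template."""
    return float(template @ probe) >= threshold  # cosine similarity
```

Comparing one probe against a single template instead of all K enrolled embeddings is what cuts the per-authentication network and compute cost.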
Gaussian process training decomposes into inference of the (approximate) posterior and learning of the hyperparameters. For non-Gaussian (non-conjugate) likelihoods, two common choices for approximate inference are Expectation Propagation (EP) and Variational Inference (VI), which have complementary strengths and weaknesses. While VI's lower bound on the marginal likelihood is a suitable objective for inferring the approximate posterior, this does not automatically make it a good learning objective for hyperparameter optimization. We design a hybrid training procedure in which inference leverages conjugate-computation VI and learning uses an EP-like marginal likelihood approximation. We empirically demonstrate on binary classification tasks that this yields a good learning objective and generalizes better.
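In standard notation (not necessarily the paper's), the two objectives being combined look roughly as follows: the VI evidence lower bound used for inferring $q(f)$, and an EP-style Gaussian-site approximation of the marginal likelihood used for learning the hyperparameters $\theta$.

```latex
% VI: lower bound on the log marginal likelihood, used for posterior inference
\mathcal{L}_{\mathrm{VI}}(\theta)
  = \mathbb{E}_{q(f)}\!\left[\log p(\mathbf{y} \mid f)\right]
  - \mathrm{KL}\!\left[q(f) \,\|\, p(f \mid \theta)\right]
  \;\le\; \log p(\mathbf{y} \mid \theta)
% EP: each non-Gaussian likelihood term is replaced by a scaled Gaussian
% site t_i(f_i), and the evidence of the resulting Gaussian model is used
% as the learning objective for the hyperparameters:
\log Z_{\mathrm{EP}}(\theta)
  \;\approx\; \log \int p(f \mid \theta) \prod_{i=1}^{n} t_i(f_i)\,\mathrm{d}f
```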
Models for bankruptcy prediction are useful in several real-world scenarios, and multiple research contributions, based on structured (numerical) as well as unstructured (textual) data, have addressed the task. However, the lack of a common benchmark dataset and evaluation strategy impedes objective comparison between models. This paper introduces such a benchmark for the unstructured-data scenario, based on novel and established datasets, in order to stimulate further research into the task. We describe and evaluate several classical and neural baseline models, and discuss the benefits and flaws of the different strategies. In particular, we find that a lightweight bag-of-words model based on static in-domain word representations obtains surprisingly good results, especially when taking textual data from several years into account. These results are critically evaluated and discussed with respect to particular aspects of the data and the task. All code for reproducing the data and the experimental results will be published.
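As a hypothetical illustration of such a lightweight bag-of-words baseline (the placeholder documents, labels, and pipeline choices below are assumptions, not the paper's exact model):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Placeholder documents and labels (1 = company later went bankrupt).
texts = ["annual report text ...", "another company filing ..."]
labels = [0, 1]

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2)),  # bag-of-words / n-gram features
    LogisticRegression(max_iter=1000),    # linear classifier on top
)
model.fit(texts, labels)
print(model.predict_proba(["a new filing ..."]))
```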
Source-free domain adaptation (SFDA) aims to adapt a classifier to an unlabeled target dataset using only a pre-trained source model. The lack of source data and the domain shift, however, make the model's predictions on the target data unreliable. We propose to quantify the uncertainty in the source model's predictions and utilize it to guide the target adaptation. To this end, we construct a probabilistic source model by incorporating a prior over the network parameters, which induces a distribution over the model's predictions. The uncertainty is estimated by employing a Laplace approximation and is incorporated to identify target data points that do not lie on the source manifold, and to down-weight them when maximizing the mutual information on the target data. Unlike recent works, our probabilistic treatment is computationally lightweight, decouples source training and target adaptation, and requires no specialized source training or changes to the model architecture. We show the advantages of uncertainty-guided SFDA over traditional SFDA in closed-set and open-set settings, and provide empirical evidence that our method is more robust to strong domain shifts even without tuning.
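A heavily simplified sketch of the idea, under assumed names and an assumed form of the certainty weighting: sample predictions from the (Laplace-approximated) posterior over parameters, use their disagreement as per-point uncertainty, and down-weight uncertain points in an information-maximization loss.

```python
import torch

def uncertainty_weighted_infomax(logit_samples: torch.Tensor) -> torch.Tensor:
    """logit_samples: (S, B, C) logits from S samples of the Laplace posterior."""
    probs = logit_samples.softmax(dim=-1)            # (S, B, C)
    mean_p = probs.mean(dim=0)                       # predictive mean, (B, C)
    # Disagreement across posterior samples ~ per-point model uncertainty.
    weight = 1.0 / (1.0 + probs.var(dim=0).sum(-1))  # (B,): down-weight uncertain points
    log_p = mean_p.clamp_min(1e-8).log()
    cond_ent = -(mean_p * log_p).sum(-1)             # per-point entropy, (B,)
    marg = mean_p.mean(dim=0)                        # batch-marginal prediction, (C,)
    marg_ent = -(marg * marg.clamp_min(1e-8).log()).sum()
    # Minimizing this encourages confident per-point but diverse batch-level
    # predictions (mutual information), with uncertain points contributing less.
    return (weight * cond_ent).mean() - marg_ent
```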
TensorFlow GNN(TF-GNN)是张量曲线的图形神经网络的可扩展库。它是从自下而上设计的,以支持当今信息生态系统中发生的丰富的异质图数据。Google的许多生产模型都使用TF-GNN,最近已作为开源项目发布。在本文中,我们描述了TF-GNN数据模型,其KERAS建模API以及相关功能,例如图形采样,分布式训练和加速器支持。
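A small example in the spirit of the library's documented GraphTensor API (exact details may vary across TF-GNN versions; the node-set and feature names here are made up):

```python
import tensorflow as tf
import tensorflow_gnn as tfgnn

# A tiny heterogeneous graph: 3 "user" nodes and 2 "follows" edges.
graph = tfgnn.GraphTensor.from_pieces(
    node_sets={
        "user": tfgnn.NodeSet.from_fields(
            sizes=tf.constant([3]),
            features={"hidden_state": tf.random.uniform((3, 4))}),
    },
    edge_sets={
        "follows": tfgnn.EdgeSet.from_fields(
            sizes=tf.constant([2]),
            adjacency=tfgnn.Adjacency.from_indices(
                source=("user", tf.constant([0, 1])),
                target=("user", tf.constant([1, 2])))),
    },
)
print(graph.node_sets["user"]["hidden_state"].shape)  # (3, 4)
```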
While diffusion models have shown great success in image generation, their noise-inverting generative process does not explicitly consider the structure of images, such as their inherent multi-scale nature. Inspired by diffusion models and the desirability of coarse-to-fine modeling, we propose a new model that generates images by iteratively inverting the heat equation, a PDE that, when run over the 2D plane of the image, locally erases fine-scale information. In our novel methodology, the solution of the forward heat equation is interpreted as a variational approximation in a directed graphical model. We demonstrate promising image quality and point out emergent qualitative properties not seen in diffusion models, such as disentanglement of overall color and shape in images, and aspects of neural network interpretability. Spectral analysis on natural images positions our model as a kind of dual to diffusion models and reveals implicit inductive biases in them.
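Schematically, and in standard notation rather than the paper's, the forward coarsening process and its learned inversion can be written as:

```latex
% Forward process: the 2D heat equation progressively dissipates
% fine-scale detail of the initial image u(x, y, 0):
\frac{\partial u(x, y, t)}{\partial t} = \Delta u(x, y, t)
% Generation runs the process backwards: a network f_theta is trained to
% undo one dissipation step from a slightly noise-perturbed input,
% iterating from a fully blurred state back to a sharp image:
u_{t-1} \approx f_\theta\!\left(u_t + \sigma\,\epsilon,\; t\right),
\qquad \epsilon \sim \mathcal{N}(0, I)
```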
It is prevalent and well-observed, but poorly understood, that two machine learning models with similar performance during training can have very different real-world performance characteristics. This implies elusive differences in the internals of the models that manifest as representational multiplicity (RM). We introduce a conceptual and experimental setup for analyzing RM and show that certain training methods systematically result in greater RM than others, as measured by activation similarity via singular vector canonical correlation analysis (SVCCA). We further correlate this with predictive multiplicity, measured as the variance of predictions on i.i.d. and out-of-distribution test sets across four common image datasets. We call for the systematic measurement and maximal exposure of RM in models, rather than its elimination. Qualitative tools such as our analysis can facilitate the understanding of RM effects and their communication to stakeholders.
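For context, SVCCA as used above compares two models' activations by truncating each activation matrix with an SVD and then running CCA between the reduced subspaces; the sketch below follows that standard recipe (the truncation rank k is an arbitrary choice here).

```python
import numpy as np
from sklearn.cross_decomposition import CCA

def svcca(acts1: np.ndarray, acts2: np.ndarray, k: int = 20) -> float:
    """acts*: (num_datapoints, num_neurons); returns mean canonical correlation."""
    def top_directions(a: np.ndarray) -> np.ndarray:
        a = a - a.mean(axis=0)                     # center each neuron
        u, s, _ = np.linalg.svd(a, full_matrices=False)
        return u[:, :k] * s[:k]                    # keep top-k singular directions
    x, y = top_directions(acts1), top_directions(acts2)
    cca = CCA(n_components=k, max_iter=2000).fit(x, y)
    xc, yc = cca.transform(x, y)
    corrs = [np.corrcoef(xc[:, i], yc[:, i])[0, 1] for i in range(k)]
    return float(np.mean(corrs))                   # SVCCA similarity score
```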
Understanding how activity in neural circuits reshapes following task learning could reveal fundamental mechanisms of learning. Thanks to recent advances in neural imaging technologies, high-quality recordings can be obtained from hundreds of neurons over multiple days or even weeks. However, the complexity and dimensionality of population responses pose significant challenges for analysis. Existing methods for studying neuronal adaptation and learning often impose strong assumptions on the data or the model, resulting in biased descriptions that do not generalize. In this work, we use a variant of deep generative models called CycleGAN to learn the unknown mapping between pre- and post-learning neural activity recorded $\textit{in vivo}$. We develop an end-to-end pipeline to preprocess, train, and evaluate calcium fluorescence signals, and a procedure to interpret the resulting deep learning models. To assess the validity of our method, we first test our framework on a synthetic dataset with known ground-truth transformations. Subsequently, we apply our method to neural activity recorded from the primary visual cortex of behaving mice, where the mice transition from novice to expert-level performance in a vision-based virtual-reality experiment. We evaluate model performance on the generated calcium signals and their inferred spike trains. To maximize performance, we derive a novel approach to pre-sort neurons such that convolution-based networks can take advantage of the spatial information present in neural activity. In addition, we incorporate visual explanation methods to improve the interpretability of our work and gain insight into the learning process as manifested in the cellular activities. Our results demonstrate that analyzing neuronal learning processes with data-driven, deep unsupervised methods holds the potential to unravel changes in an unbiased way.
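The core ingredient borrowed from CycleGAN is the cycle-consistency loss for unpaired translation between the two activity domains; a minimal sketch, with the generator names G and F_inv as illustrative assumptions:

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(G, F_inv,
                           a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    """G maps pre- to post-learning activity, F_inv maps back. A round trip
    through both generators should reconstruct the original traces."""
    loss_a = F.l1_loss(F_inv(G(a)), a)  # pre -> post -> pre
    loss_b = F.l1_loss(G(F_inv(b)), b)  # post -> pre -> post
    return loss_a + loss_b              # added to the adversarial GAN losses
```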